Feature/auditable behavior #12
base: main
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@             Coverage Diff              @@
##               main      #12      +/-   ##
============================================
+ Coverage     68.56%   69.58%    +1.01%
- Complexity     8065     8305      +240
============================================
  Files           232      259       +27
  Lines         24550    25467      +917
============================================
+ Hits          16832    17720      +888
- Misses         7718     7747      +29

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Hi Moritz, following up on your last comment in our discussion (propelorm#1983 (comment)) on the main repo - one way to check for manual DB edits could be triggers. Triggers on the audit table (BEFORE and maybe also AFTER) instantiate a communication with Propel, so Propel knows if there has been any manual CRUD on the audit table.

As Propel should basically only ever do INSERT operations on the audit table (new row), if I understand you right, Propel could hand over a UUID or hash (each maybe salted with environment-specific data) that is valid only for the duration of the specific operation Propel performs on the audit table. This value would be saved in another field of the audit table (so you have two fields on every inserted row in the audit table), and the UUID should maybe also be updated on each operation on the original table. The UUID can basically function as a lock value. On any operation on the DB, the triggers give the value back to Propel. An additional value to check would be the connection id, which should also be unique.

Communication with Propel could be done, for example, via a User Defined Function that calls Propel, although UDFs are a bit hacky and don't seem to be evenly supported across different DB systems. Another option to call Propel might be a service broker. If it is not feasible at all to let the DB call Propel by itself, there would still be the option to check the UUID of the row(s) of the audit table every time Propel does a DB operation. In that case the last UUIDs (and connection ids?) would probably have to be stored somewhere (encrypted?) by Propel as a reference. One could also compare with the log files of the DB (if activated). In this way you could, in theory, alert Propel to external changes in the DB.

Additionally, one could try to never release the write lock on the audit table. Yet this is prone to fail with connection timeouts, as the lock will be released after a certain timeout, I believe.

All in all, none of the ideas above seems to be 100% reliable/feasible on its own, yet a combination of them could provide a quite good safety measure for dealing with external DB edits. As mentioned in propelorm#1983 (comment), there would also still be the possibility to move the audit table completely to a dedicated DB which uses the archive engine to prevent UPDATE and DELETE operations.

I hope these ideas can be a way to make your auditable behavior more reliable. All the best,
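A rough sketch of the UUID/token idea in MySQL syntax (all table and column names here are made up for illustration, not taken from the PR): Propel would set a session variable right before writing, a trigger copies it into the audit row, and rows written without it stand out later.

```sql
-- Hypothetical audit table "book_audit" gets an extra token column.
ALTER TABLE book_audit ADD COLUMN operation_token CHAR(36) DEFAULT NULL;

DELIMITER //

-- Propel would run e.g. `SET @propel_audit_token = UUID();` on its own
-- connection right before the INSERT; manual inserts leave the token NULL.
CREATE TRIGGER book_audit_set_token
BEFORE INSERT ON book_audit
FOR EACH ROW
BEGIN
  SET NEW.operation_token = @propel_audit_token;
END//

-- Manual UPDATE/DELETE on the audit table can simply be rejected.
CREATE TRIGGER book_audit_block_update
BEFORE UPDATE ON book_audit
FOR EACH ROW
BEGIN
  SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'audit rows are append-only';
END//

DELIMITER ;
```

Propel could then compare the stored tokens against the ones it handed out (or simply look for NULL tokens) to spot rows it did not write itself.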
Addendum to my last option above: While doing a somewhat unrelated Google query for „propel orm leadership", I stumbled upon an elegant solution Propel supports for working with several DBs at once. I wonder how Google sometimes creates its results - maybe because I wildly searched for different Propel-related stuff today. 😄

Anyway: Propel allows you to directly access different DBs within one schema.xml, even with FKs across the different DBs, which are then directly callable as related objects. One just has to declare a schema parameter for the core DB within the schema.xml and then a foreignSchema parameter for every foreign key that lies within a different DB, e.g. a DB with the archive engine and the audit table. It is also possible to define the foreign DB directly within the same schema.xml. An example (in German) is given here: https://ansas-meyer.de/programmierung/mysql/propel-fremdschluessel-mehrere-datenbanken/

I hope that adds another puzzle piece towards sketching a solution. Best,
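A minimal sketch of what such a schema.xml could look like, based on the schema/foreignSchema attributes described above (database, table and column names are invented for illustration):

```xml
<database name="bookstore" defaultIdMethod="native" schema="app_db">
  <table name="book">
    <column name="id" type="integer" primaryKey="true" autoIncrement="true"/>
    <column name="title" type="varchar" size="255"/>
  </table>

  <!-- audit table placed in a separate database/schema,
       e.g. one whose tables use the ARCHIVE engine -->
  <table name="book_audit" schema="audit_db">
    <column name="audit_id" type="integer" primaryKey="true" autoIncrement="true"/>
    <column name="book_id" type="integer"/>
    <foreign-key foreignTable="book" foreignSchema="app_db">
      <reference local="book_id" foreign="id"/>
    </foreign-key>
  </table>
</database>
```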
Creates a table storing audit information for the table holding the behavior.
CRUD operations on the audited table will automatically create entries in an automatically generated audit table, describing the change.
Parameters
The behavior accepts most of the parameters from synced table behavior (see #10). Additionally, auditable behavior uses these parameters:
audit_id
audited_at
audit_event
changed_values
JSON
null
null
null
BLOB, CLOB
changed
null
true
|false
false
true
|false
false
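For orientation, attaching the behavior in schema.xml would presumably look something like the sketch below; the behavior name is assumed from the PR title and the table is invented, so treat this as illustration rather than the actual API:

```xml
<table name="book">
  <column name="id" type="integer" primaryKey="true" autoIncrement="true"/>
  <column name="fruit" type="varchar" size="255"/>
  <!-- behavior name assumed; parameters from the list above could be
       overridden via <parameter name="..." value="..."/> elements -->
  <behavior name="auditable"/>
</table>
```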
This implementation is somewhat specific, in that it tries to avoid duplicating the whole row data on input. Subsequent updates store the overridden values, which makes it possible to rebuild the row history backwards from the current row data. This approach has benefits, but it comes at a cost, particularly its fragility to manual row updates without audits.
The problem is that manual updates which don't add audits are not just unaccounted for, they lead to wrong audits. Currently, the stored data of an audit would look something like this:
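Illustrated with a hypothetical fruit column and made-up values (the exact storage format in the PR may differ):

```
audit_id | audit_event | changed_values
       1 | insert      | null
       2 | update      | {"fruit": "apple"}
```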
The recorded values are the ones being overridden by the update, which is why no values need to be recorded on insert. If the current value of the fruit column is "orange", you would get this audit:
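Rebuilding backwards from the current value "orange" and the stored overridden value "apple" (same hypothetical values as above):

```
insert: fruit = "apple"
update: fruit = "apple" -> "orange"
```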
However, with a manual update in between, the actual events might have been:
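Again with the same hypothetical values:

```
insert: fruit = "apple"               (through Propel, audited)
update: fruit = "apple" -> "banana"   (through Propel, audited)
update: fruit = "banana" -> "orange"  (manual SQL, no audit)
```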
So instead of showing an unaccounted change, the audit just claims that the user entered data which they never entered. This is very dicey, if acceptable at all.
Having to rebuild the audit after reading it from the DB adds another layer of possible mistakes and errors.
The clean approach would be to store the new values, instead of the overridden ones:
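For the same hypothetical events, that would give:

```
audit_id | audit_event | changed_values
       1 | insert      | {"fruit": "apple"}
       2 | update      | {"fruit": "banana"}
```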
One could possibly even store the old values as well, if you want to have at least some idea about manual updates:
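Same events, storing both old and new values (the format is illustrative only):

```
audit_id | audit_event | changed_values
       1 | insert      | {"fruit": {"old": null, "new": "apple"}}
       2 | update      | {"fruit": {"old": "apple", "new": "banana"}}
```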
With this, the reported user input is always correct, even if the values do not match the row values due to manual updates.
However, this requires duplicating all values on input, even when you know that most values will never change.
So if you know that there are no manual updates (or those updates also add audits), the first approach is more efficient, as it allows for faster inserts and requires less space.
Looking back, going the easy route would have spared me time and nerves, but I really didn't want to make full duplicates (I am dealing with rather large text columns that typically won't change after insert).
But with all the caveats and sensitivities, I am not sure if this works in the general case.